{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Vector Store Index usage examples\n",
    "\n",
    "In this guide, we show how to use the vector store index with different vector store\n",
    "implementations.  \n",
    " \n",
    "From how to get started with few lines of code with the default\n",
    "in-memory vector store with default query configuration, to using a custom hosted vector\n",
    "store, with advanced settings such as metadata filters.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Construct vector store and index\n",
    "**Default**\n",
    "\n",
    "By default, `VectorStoreIndex` uses a in-memory `SimpleVectorStore`\n",
    "that's initialized as part of the default storage context."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import VectorStoreIndex, SimpleDirectoryReader\n",
    "\n",
    "# Load documents and build index\n",
    "documents = SimpleDirectoryReader(\n",
    "    \"../../examples/data/paul_graham\"\n",
    ").load_data()\n",
    "index = VectorStoreIndex.from_documents(documents)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "**Custom vector stores**\n",
    "\n",
    "You can use a custom vector store (in this case `PineconeVectorStore`) as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pinecone\n",
    "from llama_index import VectorStoreIndex, SimpleDirectoryReader, StorageContext\n",
    "from llama_index.vector_stores import PineconeVectorStore\n",
    "\n",
    "# init pinecone\n",
    "pinecone.init(api_key=\"<api_key>\", environment=\"<environment>\")\n",
    "pinecone.create_index(\n",
    "    \"quickstart\", dimension=1536, metric=\"euclidean\", pod_type=\"p1\"\n",
    ")\n",
    "\n",
    "# construct vector store and customize storage context\n",
    "storage_context = StorageContext.from_defaults(\n",
    "    vector_store=PineconeVectorStore(pinecone.Index(\"quickstart\"))\n",
    ")\n",
    "\n",
    "# Load documents and build index\n",
    "documents = SimpleDirectoryReader(\n",
    "    \"../../examples/data/paul_graham\"\n",
    ").load_data()\n",
    "index = VectorStoreIndex.from_documents(\n",
    "    documents, storage_context=storage_context\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "For more examples of how to initialize different vector stores, \n",
    "see [Vector Store Integrations](/community/integrations/vector_stores.md)."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Connect to external vector stores (with existing embeddings)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you have already computed embeddings and dumped them into an external vector store (e.g. Pinecone, Chroma), you can use it with LlamaIndex by:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "vector_store = PineconeVectorStore(pinecone.Index(\"quickstart\"))\n",
    "index = VectorStoreIndex.from_vector_store(vector_store=vector_store)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "### Query\n",
    "**Default**  \n",
    "\n",
    "You can start querying by getting the default query engine:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine()\n",
    "response = query_engine.query(\"What did the author do growing up?\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "**Configure standard query setting**  \n",
    "\n",
    "To configure query settings, you can directly pass it as\n",
    "keyword args when building the query engine: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters\n",
    "\n",
    "query_engine = index.as_query_engine(\n",
    "    similarity_top_k=3,\n",
    "    vector_store_query_mode=\"default\",\n",
    "    filters=MetadataFilters(\n",
    "        filters=[\n",
    "            ExactMatchFilter(key=\"name\", value=\"paul graham\"),\n",
    "        ]\n",
    "    ),\n",
    "    alpha=None,\n",
    "    doc_ids=None,\n",
    ")\n",
    "response = query_engine.query(\"what did the author do growing up?\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that metadata filtering is applied against metadata specified in `Node.metadata`."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, if you are using the lower-level compositional API:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import get_response_synthesizer\n",
    "from llama_index.indices.vector_store.retrievers import VectorIndexRetriever\n",
    "from llama_index.query_engine.retriever_query_engine import (\n",
    "    RetrieverQueryEngine,\n",
    ")\n",
    "\n",
    "# build retriever\n",
    "retriever = VectorIndexRetriever(\n",
    "    index=index,\n",
    "    similarity_top_k=3,\n",
    "    vector_store_query_mode=\"default\",\n",
    "    filters=[ExactMatchFilter(key=\"name\", value=\"paul graham\")],\n",
    "    alpha=None,\n",
    "    doc_ids=None,\n",
    ")\n",
    "\n",
    "# build query engine\n",
    "query_engine = RetrieverQueryEngine(\n",
    "    retriever=retriever, response_synthesizer=get_response_synthesizer()\n",
    ")\n",
    "\n",
    "# query\n",
    "response = query_engine.query(\"what did the author do growing up?\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Configure vector store specific keyword arguments**  \n",
    "\n",
    "You can customize keyword arguments unique to a specific vector store implementation as well by passing in `vector_store_kwargs`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine(\n",
    "    similarity_top_k=3,\n",
    "    # only works for pinecone\n",
    "    vector_store_kwargs={\n",
    "        \"filter\": {\"name\": \"paul graham\"},\n",
    "    },\n",
    ")\n",
    "response = query_engine.query(\"what did the author do growing up?\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Use an auto retriever**\n",
    "\n",
    "You can also use an LLM to automatically decide query setting for you! \n",
    "Right now, we support automatically setting exact match metadata filters and top k parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import get_response_synthesizer\n",
    "from llama_index.indices.vector_store.retrievers import (\n",
    "    VectorIndexAutoRetriever,\n",
    ")\n",
    "from llama_index.query_engine.retriever_query_engine import (\n",
    "    RetrieverQueryEngine,\n",
    ")\n",
    "from llama_index.vector_stores.types import MetadataInfo, VectorStoreInfo\n",
    "\n",
    "\n",
    "vector_store_info = VectorStoreInfo(\n",
    "    content_info=\"brief biography of celebrities\",\n",
    "    metadata_info=[\n",
    "        MetadataInfo(\n",
    "            name=\"category\",\n",
    "            type=\"str\",\n",
    "            description=\"Category of the celebrity, one of [Sports, Entertainment, Business, Music]\",\n",
    "        ),\n",
    "        MetadataInfo(\n",
    "            name=\"country\",\n",
    "            type=\"str\",\n",
    "            description=\"Country of the celebrity, one of [United States, Barbados, Portugal]\",\n",
    "        ),\n",
    "    ],\n",
    ")\n",
    "\n",
    "# build retriever\n",
    "retriever = VectorIndexAutoRetriever(\n",
    "    index, vector_store_info=vector_store_info\n",
    ")\n",
    "\n",
    "# build query engine\n",
    "query_engine = RetrieverQueryEngine(\n",
    "    retriever=retriever, response_synthesizer=get_response_synthesizer()\n",
    ")\n",
    "\n",
    "# query\n",
    "response = query_engine.query(\n",
    "    \"Tell me about two celebrities from United States\"\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
